Generation of word alternative pronunciations using weighted finite state transducers

Authors

Sérgio Paulo

Luís C. Oliveira

Abstract

This paper describes a speech segmentation tool that allows alternative word pronunciations within a weighted finite state transducer (WFST) framework. Two approaches to generating word pronunciation graphs were developed and evaluated. The first approach is grapheme-based: each grapheme is converted, in the form of a WFST, into all the phones it can give rise to, and word graphs are obtained by concatenating the grapheme WFSTs. In the second approach, a training corpus is used to find the different realizations of each syllable. This information is used to generate alternative syllable-level pronunciations, represented as WFSTs, which are concatenated to produce the word graphs. Both approaches were evaluated by aligning the phone sequence generated by each approach with the manually labelled phone sequence for every utterance in the corpus. This alignment was used to compute F-rate values for each phone. The syllable-based approach produced the best results.
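The grapheme-based construction described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the table `GRAPHEME_PHONES`, its phone labels, its weights, and the helper names `grapheme_fst`, `word_graph`, and `pronunciations` are all hypothetical, chosen only to show how per-grapheme transducers concatenate into a word graph. Weights are treated as costs that add along a path, as in the tropical semiring, and silent graphemes (epsilon outputs) are not handled.

```python
from itertools import product

# Hypothetical grapheme-to-phone table; the phone labels and costs are
# illustrative assumptions, not values taken from the paper.
GRAPHEME_PHONES = {
    "c": [("k", 0.5), ("s", 1.0)],
    "a": [("a", 0.25), ("6", 1.25)],
    "s": [("s", 0.5), ("z", 1.0), ("S", 1.5)],
}


def grapheme_fst(grapheme, start):
    """Arcs (src, dst, input, output, weight) of a one-step transducer
    mapping `grapheme` to each of its candidate phones."""
    return [
        (start, start + 1, grapheme, phone, weight)
        for phone, weight in GRAPHEME_PHONES[grapheme]
    ]


def word_graph(word):
    """Concatenate the grapheme transducers: state i sits before the
    i-th grapheme, and the final state is len(word)."""
    arcs = []
    for i, grapheme in enumerate(word):
        arcs.extend(grapheme_fst(grapheme, i))
    return {"start": 0, "final": len(word), "arcs": arcs}


def pronunciations(graph):
    """Enumerate every path through the (linear) word graph and return
    (phone sequence, total cost) pairs, cheapest first."""
    by_src = {}
    for src, _dst, _inp, out, weight in graph["arcs"]:
        by_src.setdefault(src, []).append((out, weight))
    steps = [by_src[s] for s in range(graph["start"], graph["final"])]
    paths = []
    for combo in product(*steps):
        phones = tuple(phone for phone, _ in combo)
        cost = sum(weight for _, weight in combo)
        paths.append((phones, cost))
    return sorted(paths, key=lambda path: path[1])


if __name__ == "__main__":
    for phones, cost in pronunciations(word_graph("casa"))[:3]:
        print(" ".join(phones), cost)
```

Because every grapheme contributes its alternatives independently, the number of paths grows multiplicatively with word length. In a segmentation tool of the kind described, the aligner would choose among these paths using the acoustic evidence.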

Similar resources

Effect of Pronunciations on Oov Queries in Spoken Term Detection

This paper focusses on the effect of pronunciations for Out-of-Vocabulary (OOV) query terms on the performance of a spoken term detection (STD) task. OOV terms, typically proper names or foreign language terms, occur infrequently but are rich in information. The STD task returns relevant segments of speech that contain one or more of these OOV query terms. The STD system described in this paper i...

CLSP Research Note No. 48 A Weighted Finite State Transducer Translation Template Model for Statistical Machine Translation

We present a Weighted Finite State Transducer Translation Template Model for statistical machine translation. This is a source-channel model of translation inspired by the Alignment Template translation model. The model attempts to overcome the deficiencies of word-to-word translation models by considering phrases rather than words as units of translation. The approach we describe allows us to i...

Statistical Method of Building Dialect Language Models for ASR Systems

This paper develops a new statistical method of building language models (LMs) of Japanese dialects for automatic speech recognition (ASR). One possible application is to recognize a variety of utterances in our daily lives. The most crucial problem in training language models for dialects is the shortage of linguistic corpora in dialects. Our solution is to transform linguistic corpora into di...

Use of Weighted Finite State Transducers in Part of Speech

This paper addresses issues in part of speech disambiguation using finite-state transducers and presents two main contributions to the field. One of them is the use of finite-state machines for part of speech tagging. Linguistic and statistical information is represented in terms of weights on transitions in weighted finite-state transducers. Another contribution is the successful combination of techni...

A Weighted Finite State Transducer Implementation of the Alignment Template Model for Statistical Machine Translation

We present a derivation of the alignment template model for statistical machine translation and an implementation of the model using weighted finite state transducers. The approach we describe allows us to implement each constituent distribution of the model as a weighted finite state transducer or acceptor. We show that bitext word alignment and translation under the model can be performed wit...

Publication date: 2005
